Hierarchical Classification for Spoken Arabic Dialect Identification using Prosody: Case of Algerian Dialects

Authors

  • Soumia Bougrine
  • Hadda Cherroun
  • Djelloul Ziadi
Abstract

In daily communication, Arabs use local dialects, which are hard to identify automatically using conventional classification methods. This challenging dialect identification task becomes even more complicated when dealing with under-resourced dialects belonging to the same country/region. In this paper, we start by statistically analyzing Algerian dialects in order to capture their specificities related to prosodic information, which is extracted at the utterance level after a coarse-grained consonant/vowel segmentation. Based on the findings of this analysis, we propose a Hierarchical classification approach for spoken Arabic Algerian Dialect IDentification (HADID). It takes advantage of the fact that dialects are naturally structured into a hierarchy. Within HADID, a top-down hierarchical classification is applied, in which we use Deep Neural Networks (DNNs) to build a local classifier for every parent node in the dialect hierarchy. Our framework is implemented and evaluated on an Algerian Arabic dialect corpus, while the dialect hierarchy is deduced from historical and linguistic knowledge. The results reveal that, within HADID, DNNs perform better than Support Vector Machines. In addition, compared with a baseline flat classification system, HADID gives an improvement of 63.5% in terms of precision. Furthermore, the overall results demonstrate the suitability of our prosody-based HADID for speaker-independent dialect identification while requiring test utterances shorter than 6 s.

Preprint submitted to Elsevier, March 30, 2017. arXiv:1703.10065v1 [cs.CL] 29 Mar 2017
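The top-down scheme described in the abstract (one local classifier per parent node, applied from the root down to a leaf dialect) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the hierarchy and its node names (`group_A`, `dialect_1`, ...) are hypothetical placeholders, not the authors' Algerian dialect tree; the two-dimensional toy vectors stand in for utterance-level prosodic features; and a nearest-centroid rule is swapped in for the DNN local classifiers used in the paper.

```python
from collections import defaultdict


class NearestCentroid:
    """Placeholder local classifier (the paper uses DNNs; this is a stand-in)."""

    def fit(self, X, y):
        sums, counts = {}, defaultdict(int)
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for i, value in enumerate(x):
                acc[i] += value
            counts[label] += 1
        # Mean feature vector per class
        self.centroids = {
            label: [v / counts[label] for v in total]
            for label, total in sums.items()
        }
        return self

    def predict_one(self, x):
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(self.centroids, key=lambda label: sq_dist(self.centroids[label]))


# Hypothetical hierarchy (parent -> children); leaf nodes are dialects.
HIERARCHY = {
    "root": ["group_A", "group_B"],
    "group_A": ["dialect_1", "dialect_2"],
    "group_B": ["dialect_3", "dialect_4"],
}


def leaves_under(node, hierarchy):
    """Return all leaf labels in the subtree rooted at `node`."""
    if node not in hierarchy:
        return [node]
    return [leaf for child in hierarchy[node] for leaf in leaves_under(child, hierarchy)]


def train_local_classifiers(X, y, hierarchy):
    """Train one classifier per parent node to choose among its children."""
    models = {}
    for parent, children in hierarchy.items():
        # Map each leaf dialect to the child of `parent` that contains it.
        child_of = {
            leaf: child
            for child in children
            for leaf in leaves_under(child, hierarchy)
        }
        rows = [(x, child_of[label]) for x, label in zip(X, y) if label in child_of]
        models[parent] = NearestCentroid().fit(
            [x for x, _ in rows], [target for _, target in rows]
        )
    return models


def predict_top_down(x, models, root="root"):
    """Descend from the root, applying each local classifier until a leaf."""
    node = root
    while node in models:
        node = models[node].predict_one(x)
    return node


# Made-up toy features, two utterances per hypothetical dialect.
TOY_X = [[0, 0], [1, 0], [0, 5], [1, 5], [10, 0], [11, 0], [10, 5], [11, 5]]
TOY_Y = ["dialect_1", "dialect_1", "dialect_2", "dialect_2",
         "dialect_3", "dialect_3", "dialect_4", "dialect_4"]

models = train_local_classifiers(TOY_X, TOY_Y, HIERARCHY)
print(predict_top_down([0.2, 4.8], models))
```

In this sketch each local classifier only sees the training utterances that fall under its own parent node, so an error made at the root cannot be corrected lower in the tree; that trade-off is inherent to top-down hierarchical classification in general.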


Similar resources

Using prosody and phonotactics in Arabic dialect identification

While Modern Standard Arabic is the formal spoken and written language of the Arab world, dialects are the major communication mode for everyday life; identifying a speaker’s dialect is thus critical to speech processing tasks such as automatic speech recognition, as well as speaker identification. We examine the role of prosodic features (intonation and rhythm) across four Arabic dialects: Gul...


Dialect separation assessment using log-likelihood score distributions

Dialect differences within a given language represent major challenges for sustained speech system performance. For speech recognition, little if any knowledge exists on differences between dialects (e.g. vocabulary, grammar, prosody, etc.). Effective dialect classification can contribute to improved ASR, speaker ID, and spoken document retrieval. This study presents an approach to establish a...


Arabic Dialect Identification

The written form of the Arabic language, Modern Standard Arabic (MSA), differs in a nontrivial manner from the various spoken regional dialects of Arabic – the true “native languages” of Arabic speakers. Those dialects, in turn, differ quite a bit from each other. However, due to MSA’s prevalence in written form, almost all Arabic datasets have predominantly MSA content. In this article, we des...


Spoken Arabic Dialect Identification Using Phonotactic Modeling

The Arabic language is a collection of multiple variants, among which Modern Standard Arabic (MSA) has a special status as the formal written standard language of the media, culture and education across the Arab world. The other variants are informal spoken dialects that are the media of communication for daily life. Arabic dialects differ substantially from MSA and each other in terms of phono...


Prosody as a distinctive feature for the discrimination of Arabic dialects

The aim of the work to be reported here is to explore the utility of prosodic information in language identification and discrimination tasks. The purpose of this study is to see whether prosodic patterns can be considered as reliable acoustic cues for the discrimination of Arabic dialects by investigating, via a perceptual experiment, if listeners are successful in identifying the Arabic diale...



Journal:
  • CoRR

Volume: abs/1703.10065

Publication date: 2017